Phonetic Alignment and Similarity
Abstract
The computation of the optimal phonetic alignment and the phonetic similarity between words is an important step in many applications in computational phonology, including dialectometry. After discussing several related algorithms, I present a novel approach to the problem that employs a scoring scheme for computing phonetic similarity between phonetic segments on the basis of multivalued articulatory phonetic features. The scheme incorporates the key concept of feature salience, which is necessary to properly balance the importance of various features. The new algorithm combines several techniques developed for sequence comparison: an extended set of edit operations, local and semiglobal modes of alignment, and the capability of retrieving a set of near-optimal alignments. On a set of 82 cognate pairs, it performs better than comparable algorithms reported in the literature.
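The idea of salience-weighted feature similarity combined with local alignment can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the three features, their numeric values (scaled to 0-100), the salience weights, the constants MAX_SCORE and GAP, and the function names similarity and local_align. The sketch covers only substitutions and single-segment gaps in local mode; it does not implement the extended edit operations, the semiglobal mode, or the retrieval of near-optimal alignments mentioned above.

```python
# Illustrative sketch only: feature values, salience weights and constants
# below are hypothetical, not the values used in the paper.

# Salience: how much a difference in each feature counts.
SALIENCE = {"place": 40, "manner": 50, "voice": 10}

# Multivalued features on a 0-100 scale for a tiny segment inventory.
FEATURES = {
    "p": {"place": 100, "manner": 100, "voice": 0},
    "b": {"place": 100, "manner": 100, "voice": 100},
    "t": {"place": 85, "manner": 100, "voice": 0},
    "d": {"place": 85, "manner": 100, "voice": 100},
    "s": {"place": 85, "manner": 80, "voice": 0},
    "a": {"place": 60, "manner": 0, "voice": 100},
}

MAX_SCORE = 35  # score for matching identical segments (assumed)
GAP = -10       # penalty for skipping one segment (assumed)


def similarity(p, q):
    """Similarity of two segments: MAX_SCORE minus the salience-weighted
    sum of feature differences. May be negative for unlike segments."""
    dist = sum(
        SALIENCE[f] * abs(FEATURES[p][f] - FEATURES[q][f]) for f in SALIENCE
    )
    return MAX_SCORE - dist // 100


def local_align(x, y):
    """Smith-Waterman style local alignment of two segment lists.
    Returns (best score, list of aligned pairs, '-' marks a gap)."""
    n, m = len(x), len(y)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(
                0,  # local mode: an alignment may start anywhere
                H[i - 1][j - 1] + similarity(x[i - 1], y[j - 1]),
                H[i - 1][j] + GAP,
                H[i][j - 1] + GAP,
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    pairs = []
    while H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + similarity(x[i - 1], y[j - 1]):
            pairs.append((x[i - 1], y[j - 1]))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + GAP:
            pairs.append((x[i - 1], "-"))
            i -= 1
        else:
            pairs.append(("-", y[j - 1]))
            j -= 1
    return best, pairs[::-1]


if __name__ == "__main__":
    score, pairs = local_align(["t", "a", "p"], ["d", "a", "b"])
    print(score, pairs)
```

With these assumed weights, a voicing difference (p versus b) costs far less than a manner difference, which is the balancing role the abstract attributes to salience.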
Similar resources
A New Algorithm for the Alignment of Phonetic Sequences
Alignment of phonetic sequences is a necessary step in many applications in computational phonology. After discussing various approaches to phonetic alignment, I present a new algorithm that combines a number of techniques developed for sequence comparison with a scoring scheme for computing phonetic similarity on the basis of multivalued features. The algorithm performs better on cognate align...
A measure of phonetic similarity to quantify pronunciation variation by using ASR technology
Researchers are interested in how to define a quantitative measure of phonetic similarity between IPA transcripts of the same sentence read by two speakers. This problem can be divided into how to align the two transcripts and how to quantify the alignment gap. In this paper, we introduce a method of similarity calculation using phone-based or phoneme-based acoustic models trained with the algorith...
String Similarity Measures and PAM-like Matrices for Cognate Identification
We present a new automatic learning system for the identification of cognates, words that derive from a common ancestor and share the same etymological origin. Our approach combines and adapts several techniques developed for biological sequence analysis to the natural language processing environment. We design a linguistically inspired matrix to sensibly align our training dataset. We introduce a ...
Segment Boundaries in Low Latency Phonetic Recognition
This study analyses how reducing the look-ahead length of a two-pass phonetic decoder influences the alignment of the segment boundaries. It is shown that the optimization of some tuning parameters, such as the insertion penalty, depends on the look-ahead length. It is also suggested that the insertion penalty be dynamically adjusted to some measure of similarity of the phonetic seg...
Estimating and visualizing language similarities using weighted alignment and force-directed graph layout
The paper reports several studies on quantifying language similarity via phonetic alignment of core vocabulary items (taken from Wichmann's Automated Similarity Judgement Program database). Weighted alignment according to the Needleman-Wunsch algorithm turns out to yield the best results. For visualization and data exploration purposes, we used an implementation of the Fruchterman-Reingold...
Journal title:
- Computers and the Humanities
Volume 37
Publication date 2003